Abstract

The standard textbook method for estimating the probability of a biased coin from finite tosses implicitly 
assumes the sample sizes are large and gives incorrect results for small samples. We describe the exact solution, 
which is correct for any sample size. 
1 Introduction

Consider the following problem. A biased coin, with an unknown probability p of heads, is tossed n times, and 
m heads result. What is the best estimate of p from n and m? 

Problems of this form occur in many applications. A typical example is found in Ref. [1, p. 346]. The
problem there is to determine the percentage of Democratic votes, and the confidence interval for that percentage,
given that 917 voters in a sample of 1,600 (out of 25,000) are Democrats. 

The solution given in the book is as follows. The observed ratio $917/1{,}600 \approx 0.57$ is the estimate for the fraction of
Democratic voters. The standard deviation (SD) is estimated by a bootstrap procedure as $\sqrt{0.57 \cdot 0.43} \approx 0.50$.
The standard error (SE) is computed as $\sqrt{1{,}600} \cdot \mathrm{SD} \approx 20$, or 1.25%. The 95% confidence interval is estimated
as $57 \pm 2 \cdot 1.25\%$.
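
The arithmetic of the standard method can be sketched in a few lines of Python. This is only an illustration of the steps just described; the function name `standard_method` and the parameter `z` (the number of standard errors on each side, 2 for roughly 95%) are our own labels, not taken from Ref. [1].

```python
import math

def standard_method(m, n, z=2.0):
    """Textbook estimate and approximate interval for m successes in n trials.

    z is the number of standard errors on each side (an illustrative
    parameter; 2 corresponds to roughly 95%).
    """
    p_hat = m / n                          # observed fraction of successes
    sd = math.sqrt(p_hat * (1.0 - p_hat))  # bootstrap estimate of the SD
    se_count = math.sqrt(n) * sd           # standard error of the count
    se_frac = se_count / n                 # standard error of the fraction
    return p_hat, p_hat - z * se_frac, p_hat + z * se_frac

estimate, low, high = standard_method(917, 1600)
print(f"{estimate:.3f} [{low:.3f}, {high:.3f}]")
```

Applied to m = 0 and n = 5, the same function returns an estimate of zero with a zero-width interval, because the estimated SD is itself zero.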

This method, which we will call the "standard method," does not work when sample sizes are small or the
fraction is near the extremes of 0 or 1. For example, suppose an urn is filled with marbles, an unknown fraction
of which are red and the rest white. A sample of 5 marbles is taken (with replacement), and in that sample 
all marbles are white. What is the fraction and 80% confidence interval of red marbles? The standard method 
yields a fraction of zero and a confidence interval of zero. These are obviously wrong. A sample of 5 whites 
indicates that the fraction of reds is probably small, but it certainly provides no assurance that it is zero; instead, 
it is very likely to be nonzero. 

We will describe another method, which we will call the "exact method," that does not have these errors. 
There are many practical cases where small sample sizes are important. For example, a medical trial may involve 
just a dozen or so patients. It can be useful to use the exact method for such studies. Indeed, in some cases 
it might be considered irresponsible not to use it, since the standard method could lead to incorrect decisions 
based on misleading results. 

This problem arose when the authors needed an estimate and confidence interval for the probability of a rare 
event that may occur fewer than a dozen times, or even never, in a sample of several billion. After the standard
method yielded obviously wrong results, the authors were surprised that a literature search did not yield the 
exact method. The purpose of this note is to document it for general use. 

The next section describes the assumptions and detailed derivation of the exact method. The reader who 
just wants to see the final result may refer to Eqs. (1), (2), and (3) below, which show the exact method's mean,
lower confidence limit, and upper confidence limit, respectively.
2 The exact method

The primary example we will use in our development is the biased coin problem, which is equivalent to a finite 
sample from an infinite population. Formally, the problem is to estimate an unknown probability of success p in 
a Bernoulli process, knowing only that there were m successes in an experimental run of n trials.

The equivalent problem for finite populations is sampling with replacement, where each sample is put back 
into the population pool so that it will have an equal chance of being drawn again.

Let p be the unknown probability of heads for the biased coin. We will assume that p is uniformly distributed 
between 0 and 1, that is, all values of p between 0 and 1 are equally likely. This seems to be a reasonable
assumption absent any other information. 

We will first look at the case where p has k discrete values between 0 and 1. This will let us study the
problem with simple examples in order to understand the sample space intuitively. (The discrete case can also
stand on its own as a useful result whenever the probabilities actually are discrete.) Once we derive the
result for arbitrary k, we can take the limit as $k \to \infty$ to obtain the exact result for a continuously distributed
p.

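We will denote the discrete values of p by $p_1, p_2, \ldots, p_k$:

$$p_1 = \frac{1}{k}\left(1 - \frac{1}{2}\right) \quad \left(\text{representing } 0 < p < \frac{1}{k}\right)$$

$$p_2 = \frac{1}{k}\left(2 - \frac{1}{2}\right) \quad \left(\text{representing } \frac{1}{k} < p < \frac{2}{k}\right)$$

$$\vdots$$

$$p_k = \frac{1}{k}\left(k - \frac{1}{2}\right) \quad \left(\text{representing } \frac{k-1}{k} < p < 1\right).$$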

Given a coin with probability $p_i$ ($1 \le i \le k$) of heads, the probability of tails is $1 - p_i$. The probability of
a specific finite sequence beginning (for example) head, tail, head, head, tail, ... is $p_i (1 - p_i)\, p_i\, p_i (1 - p_i) \cdots$. The
probability of obtaining a specific sequence of n tosses containing m heads is thus

$$p_i^m (1 - p_i)^{n-m}.$$

There are $\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$ ways of obtaining m heads out of n tosses. Thus the probability of exactly m
heads in n tosses is

$$\binom{n}{m}\, p_i^m (1 - p_i)^{n-m}.$$
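
As a quick numerical check of this binomial expression, the following minimal sketch evaluates it directly. The function name `prob_m_heads` is an illustrative choice of ours.

```python
import math

def prob_m_heads(n, m, p):
    """Probability of exactly m heads in n tosses of a coin with head probability p."""
    # Number of orderings times the probability of any one specific ordering.
    return math.comb(n, m) * p**m * (1.0 - p)**(n - m)

# The probabilities over all possible head counts should sum to 1.
print(sum(prob_m_heads(5, m, 0.25) for m in range(6)))
```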



To motivate the main argument, consider the simple example where k = 2. We have:

$$p_1 = \frac{1}{4}, \qquad p_2 = \frac{3}{4}.$$

Suppose we perform a large number t of trials (which we can later take to infinity; in fact, t will cancel in
the final result), say t = 1,000,000, each with n tosses, for a coin with probability $p_1$ and also for a coin with
probability $p_2$. The expected number of n-toss trials resulting in m heads will be

$$q_1 + q_2$$

where

$$q_1 = t \binom{n}{m}\, p_1^m (1 - p_1)^{n-m}$$

$$q_2 = t \binom{n}{m}\, p_2^m (1 - p_2)^{n-m}$$

Thus for any particular n-toss trial with m heads, the probability that it came from the $p_1$ coin is

$$e_1 = \frac{q_1}{q_1 + q_2} = \frac{p_1^m (1 - p_1)^{n-m}}{p_1^m (1 - p_1)^{n-m} + p_2^m (1 - p_2)^{n-m}}$$

and similarly for the $p_2$ coin. As an example, for n = 5 and m = 1, we have

$$e_1 \approx 0.964, \qquad e_2 \approx 0.036.$$

This means that if we know that a coin has an unknown probability of heads of either 1/4 or 3/4, and we observe 1
head in a 5-toss sample, 96.4% of the time the coin's probability will be 1/4.
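
The two-coin example can be reproduced with a short script. This is a minimal sketch; the function name `posterior_two_coins` and its default arguments are illustrative choices of ours.

```python
def posterior_two_coins(n, m, p1=0.25, p2=0.75):
    """Probability that m heads in n tosses came from the p1 coin or the p2 coin."""
    # The binomial coefficient and the trial count t cancel in the ratio,
    # so only the p^m (1 - p)^(n - m) factors are needed.
    w1 = p1**m * (1 - p1)**(n - m)
    w2 = p2**m * (1 - p2)**(n - m)
    total = w1 + w2
    return w1 / total, w2 / total

e1, e2 = posterior_two_coins(5, 1)
print(round(e1, 3), round(e2, 3))  # prints: 0.964 0.036
```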
Going back to the general case, the expected probability of a coin having head probability $p_i$, $1 \le i \le k$, based
on a sample of n tosses where m heads are observed, is

$$e_i = \frac{p_i^m (1 - p_i)^{n-m}}{\sum_{j=1}^{k} p_j^m (1 - p_j)^{n-m}}.$$

The mean expected probability is computed in the standard way:

$$E[e_i] = \sum_{i=1}^{k} p_i\, e_i,$$

and the confidence interval can be computed (say with a computer algorithm) from the distribution $e_i$. The
number of intervals k can be made as large as desired for sufficient accuracy.
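
One possible "computer algorithm" for the discrete case is sketched below. It is an illustration under our own assumptions rather than the authors' implementation: the function names are hypothetical, the weights are computed in log space to avoid underflow for large n, and each confidence limit is taken as the first grid value at which the cumulative sum of the $e_i$ reaches the target (so the result is accurate only to about 1/k).

```python
import math

def discrete_posterior(n, m, k):
    """Grid values p_i = (i - 1/2)/k and their probabilities e_i."""
    grid = [(i - 0.5) / k for i in range(1, k + 1)]
    # Work with logarithms so that large n does not underflow to zero.
    log_w = [m * math.log(p) + (n - m) * math.log(1.0 - p) for p in grid]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    return grid, [w / total for w in weights]

def discrete_summary(n, m, k, c):
    """Mean and c*100% interval of the discrete distribution, for 0 < c < 1."""
    grid, probs = discrete_posterior(n, m, k)
    mean = sum(p * e for p, e in zip(grid, probs))
    lower_target = 0.5 * (1.0 - c)
    upper_target = 0.5 * (1.0 + c)
    lower = upper = None
    cumulative = 0.0
    for p, e in zip(grid, probs):
        cumulative += e
        if lower is None and cumulative >= lower_target:
            lower = p
        if upper is None and cumulative >= upper_target:
            upper = p
            break
    if upper is None:  # guard against rounding when the target is close to 1
        upper = grid[-1]
    return mean, lower, upper

print(discrete_summary(5, 0, 1000, 0.8))
```

With k = 2 the function `discrete_posterior(5, 1, 2)` reduces to the two-coin example, and increasing k refines the grid at the cost of more arithmetic.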

We take the limit as $k \to \infty$ to obtain the exact (continuous) probability density $e(x)$ for head probability
$x$, $0 < x < 1$:

$$e(x) = \frac{x^m (1 - x)^{n-m}}{\int_0^1 y^m (1 - y)^{n-m}\, dy} = \frac{x^m (1 - x)^{n-m}}{B(m+1,\, n-m+1)},$$

where $B(i, j) = \int_0^1 y^{i-1} (1 - y)^{j-1}\, dy$ is the beta function.

The exact expectation value of the distribution $e(x)$ is then

$$E[e(x)] = \frac{\int_0^1 x\, [x^m (1 - x)^{n-m}]\, dx}{B(m+1,\, n-m+1)} = \frac{B(m+2,\, n-m+1)}{B(m+1,\, n-m+1)}.$$

It can be shown that the last ratio evaluates to $\frac{m+1}{n+2}$. Thus we have a surprisingly simple formula for the expected
probability of heads for the biased coin,

$$E[e(x)] = \frac{m+1}{n+2}. \tag{1}$$

(This compares to the corresponding standard method expectation $\frac{m}{n}$, showing the two are nearly the same for
large n and m.) 
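
The evaluation of the ratio of beta functions behind Eq. (1) can be written out explicitly. The short derivation below relies on two standard identities that are not stated in the text, $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ and $\Gamma(z+1) = z\,\Gamma(z)$:

```latex
\begin{align*}
\frac{B(m+2,\, n-m+1)}{B(m+1,\, n-m+1)}
  &= \frac{\Gamma(m+2)\,\Gamma(n-m+1)}{\Gamma(n+3)}
     \cdot \frac{\Gamma(n+2)}{\Gamma(m+1)\,\Gamma(n-m+1)} \\
  &= \frac{\Gamma(m+2)}{\Gamma(m+1)} \cdot \frac{\Gamma(n+2)}{\Gamma(n+3)}
   = \frac{m+1}{n+2}.
\end{align*}
```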

The confidence interval is a little harder to compute. For a confidence interval of $c \cdot 100\%$, we need to find
$x_1$ and $x_2$ such that the cumulative distribution of the probability density $e(x)$ equals $\frac{1}{2}(1 - c)$ and $\frac{1}{2}(1 + c)$, for
example 0.1 and 0.9 for an 80% confidence interval:

$$\int_0^{x_1} e(x)\, dx = \frac{\int_0^{x_1} x^m (1 - x)^{n-m}\, dx}{B(m+1,\, n-m+1)} = \frac{1}{2}(1 - c),$$

$$\int_0^{x_2} e(x)\, dx = \frac{\int_0^{x_2} x^m (1 - x)^{n-m}\, dx}{B(m+1,\, n-m+1)} = \frac{1}{2}(1 + c).$$



The integrals can be expressed with regularized incomplete beta functions $I_{x_1}(m+1,\, n-m+1)$ and
$I_{x_2}(m+1,\, n-m+1)$, so obtaining the confidence interval amounts to solving the two equations

$$I_{x_1}(m+1,\, n-m+1) = \tfrac{1}{2}(1 - c)$$

$$I_{x_2}(m+1,\, n-m+1) = \tfrac{1}{2}(1 + c)$$

for $x_1$ and $x_2$. The solutions can be expressed as inverse regularized incomplete beta functions:

$$x_1 = I^{-1}_{\frac{1}{2}(1-c)}(m+1,\, n-m+1) \tag{2}$$

$$x_2 = I^{-1}_{\frac{1}{2}(1+c)}(m+1,\, n-m+1) \tag{3}$$

These can be evaluated using, for example, a computer algebra system.

Example 1. For the marble problem described in Sec. 1 we have $c = 0.8$, $m = 0$, $n = 5$. The exact method
shows that the 80% confidence interval is between $x_1 \approx 0.017$ and $x_2 \approx 0.319$, with a mean from Eq. (1) of
$\frac{1}{7} \approx 0.143$. This is very different from the (incorrect) zero confidence interval and zero mean that the standard
method yields.

Example 2. For the voting problem described in Sec. 1 we have $c = 0.95$, $m = 917$, $n = 1600$. With the
exact method, the mean is $\approx 0.573$ with a 95% confidence interval between $x_1 \approx 0.549$ and $x_2 \approx 0.597$. The
standard method yields mean 0.573 and 95% confidence interval between 0.548 and 0.598, showing that the two
methods approximately agree when the sample size is large.
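
The two examples above can be checked without a computer algebra system. The sketch below is one possible approach under stated assumptions, not the authors' code: instead of a library inverse regularized incomplete beta function, it uses the binomial-tail identity for integer arguments, $I_x(a, b) = \sum_{j=a}^{a+b-1} \binom{a+b-1}{j} x^j (1-x)^{a+b-1-j}$, and inverts it by bisection. All function names are illustrative.

```python
import math

def regularized_incomplete_beta(x, m, n):
    """I_x(m + 1, n - m + 1) for integers 0 <= m <= n and 0 < x < 1.

    Evaluated as a binomial tail over n + 1 trials; terms are computed
    in log space so that large n does not overflow or underflow.
    """
    trials = n + 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    total = 0.0
    for j in range(m + 1, trials + 1):
        log_term = (math.lgamma(trials + 1) - math.lgamma(j + 1)
                    - math.lgamma(trials - j + 1)
                    + j * log_x + (trials - j) * log_1mx)
        total += math.exp(log_term)
    return min(total, 1.0)

def inverse_beta(target, m, n, iterations=50):
    """Solve I_x(m + 1, n - m + 1) = target for x by bisection on (0, 1)."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if regularized_incomplete_beta(mid, m, n) < target:
            low = mid   # the cumulative distribution is increasing in x
        else:
            high = mid
    return 0.5 * (low + high)

def exact_method(m, n, c):
    """Mean from Eq. (1) and confidence limits from Eqs. (2) and (3)."""
    mean = (m + 1) / (n + 2)
    x1 = inverse_beta(0.5 * (1.0 - c), m, n)
    x2 = inverse_beta(0.5 * (1.0 + c), m, n)
    return mean, x1, x2

print(exact_method(0, 5, 0.8))       # marble problem
print(exact_method(917, 1600, 0.95)) # voting problem
```

Bisection is slower than a dedicated library routine but needs only the standard library; where an inverse regularized incomplete beta function is available, it can replace `inverse_beta` directly.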



$I_{1-p}(n - k,\, k + 1)$ is the well-known cumulative distribution function for the number of successes k for the binomial distribution of
n trials from a Bernoulli process with a known probability p of success. Because this problem has frequent applications, most computer 
algebra systems provide the regularized incomplete beta function and its inverse. 
3 Conclusion

The formulas for the exact method are nearly as simple to state as those for the standard method. But they 
have the significant advantage of being exact rather than approximate, with no errors when sample sizes are 
small. 

In the exact method, the derivation from first principles is straightforward and rigorous, with all assumptions 
clearly laid out. This contrasts with the standard method, which involves the mathematically questionable (or at
least not rigorously justified) bootstrapping procedure as well as the implicit use of Gaussian distributions to
approximate non-Gaussian ones. The errors involved in these approximations, as well as their regimes of
validity, are difficult to determine and typically glossed over. Regarding bootstrapping, the authors of Ref. [1] say merely
that "the estimate is good when the sample is reasonably large" even though the procedure "may seem crude"
[1, p. 342].

It is not clear why the exact method isn't mentioned in most textbooks or, indeed, why it isn't universally 
used instead of the standard method. Apparently the exact method is not well known. 